Sequential Dexterity: Chaining Dexterous Policies for Long-Horizon Manipulation
Many real-world manipulation tasks consist of a series of subtasks that are
significantly different from one another. Such long-horizon, complex tasks
highlight the potential of dexterous hands, which possess adaptability and
versatility, capable of seamlessly transitioning between different modes of
functionality without the need for re-grasping or external tools. However,
challenges arise from the high-dimensional action space of dexterous hands and
the complex compositional dynamics of long-horizon tasks. We present Sequential
Dexterity, a general system based on reinforcement learning (RL) that chains
multiple dexterous policies for achieving long-horizon task goals. The core of
the system is a transition feasibility function that progressively fine-tunes
the sub-policies to improve the chaining success rate, while also enabling
autonomous policy switching for recovery from failures and bypassing redundant
stages. Despite being trained only in simulation with a few task objects, our
system demonstrates generalization capability to novel object shapes and is
able to zero-shot transfer to a real-world robot equipped with a dexterous
hand. More details and video results can be found at
https://sequential-dexterity.github.io
Comment: CoRL 202
Spiking PointNet: Spiking Neural Networks for Point Clouds
Recently, Spiking Neural Networks (SNNs), enjoying extreme energy efficiency,
have drawn much research attention on 2D visual recognition and shown gradually
increasing application potential. However, it still remains underexplored
whether SNNs can be generalized to 3D recognition. To this end, we present
Spiking PointNet in this paper, the first spiking neural model for efficient
deep learning on point clouds. We find that the two major obstacles limiting
the application of SNNs in point clouds are: the intrinsic optimization
obstacle of SNNs that impedes the training of a big spiking model with large
time steps, and the expensive memory and computation cost of PointNet that
makes training a big spiking point model unrealistic. To solve the problems
simultaneously, we present a trained-less but learning-more paradigm for
Spiking PointNet with theoretical justifications and in-depth experimental
analysis. Specifically, our Spiking PointNet is trained with only a single time
step but obtains better performance with multi-time-step inference than a
model trained directly with multiple time steps. We conduct
various experiments on ModelNet10 and ModelNet40 to demonstrate the effectiveness
of Spiking PointNet. Notably, our Spiking PointNet can even outperform its ANN
counterpart, which is rare in the SNN field and thus provides a potential research
direction for future work. Moreover, Spiking PointNet shows impressive
speedup and storage savings in the training phase.
Comment: Accepted by NeurIP
Learning a Universal Human Prior for Dexterous Manipulation from Human Preference
Generating human-like behavior on robots is a great challenge, especially in
dexterous manipulation tasks with robotic hands. Even in simulation with no
sample constraints, scripting controllers is intractable due to high degrees of
freedom, and manual reward engineering can also be hard and lead to
non-realistic motions. Leveraging the recent progress on Reinforcement Learning
from Human Feedback (RLHF), we propose a framework to learn a universal human
prior using direct human preference feedback over videos, for efficiently
tuning the RL policy on 20 dual-hand robot manipulation tasks in simulation,
without a single human demonstration. One task-agnostic reward model is trained
by iteratively generating diverse policies and collecting human preferences
over the trajectories; it is then applied to regularize the behavior of
policies in the fine-tuning stage. Our method empirically demonstrates more
human-like behaviors on robot hands in diverse tasks, including unseen
tasks, indicating its generalization capability.
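A reward model trained from pairwise human preferences can be illustrated with a short sketch. The abstract does not state the loss used, so the Bradley-Terry formulation below (common in RLHF work) is an assumption, and the function names `trajectory_return` and `preference_loss` are hypothetical.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def trajectory_return(step_rewards: list[float]) -> float:
    """Sum the reward model's per-step outputs over one trajectory."""
    return sum(step_rewards)


def preference_loss(preferred: list[float], other: list[float]) -> float:
    """Bradley-Terry negative log-likelihood for one labeled pair (assumed form).

    P(preferred beats other) = sigmoid(R_preferred - R_other), so the loss is
    -log sigmoid(d) = softplus(-d) with d = R_preferred - R_other.
    """
    d = trajectory_return(preferred) - trajectory_return(other)
    return softplus(-d)


# The loss shrinks as the reward model scores the human-preferred clip higher.
agrees = preference_loss([0.9, 0.8], [0.1, 0.2])
disagrees = preference_loss([0.1, 0.2], [0.9, 0.8])
```

Minimizing this loss over many labeled pairs pushes the reward model to rank trajectories the way the human labelers did.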
Dynamic Handover: Throw and Catch with Bimanual Hands
Humans throw and catch objects all the time. However, such a seemingly common
skill poses many challenges for robots: they need to
perform such dynamic actions at high speed, collaborate precisely, and interact
with diverse objects. In this paper, we design a system with two multi-finger
hands attached to robot arms to solve this problem. We train our system using
Multi-Agent Reinforcement Learning in simulation and perform Sim2Real transfer
to deploy on the real robots. To overcome the Sim2Real gap, we provide multiple
novel algorithm designs, including learning a trajectory prediction model for
the object. Such a model gives the robot catcher a real-time estimate
of where the object is heading so that it can react accordingly. We conduct our
experiments with multiple objects in the real-world system, and show
significant improvements over multiple baselines. Our project page is available
at https://binghao-huang.github.io/dynamic_handover/.
Comment: Accepted at CoRL 2023.
Membrane Potential Batch Normalization for Spiking Neural Networks
As one of the energy-efficient alternatives to conventional neural networks
(CNNs), spiking neural networks (SNNs) have gained more and more interest
recently. To train deep models, some effective batch normalization (BN)
techniques have been proposed for SNNs. All these BNs are suggested to be used after
the convolution layer, as is usually done in CNNs. However, the spiking neuron is
much more complex, with its spatio-temporal dynamics. The regulated data flow
after the BN layer will be disturbed again by the membrane potential updating
operation before the firing function, i.e., the nonlinear activation.
Therefore, we advocate adding another BN layer before the firing function to
normalize the membrane potential again, called MPBN. To eliminate the induced
time cost of MPBN, we also propose a training-inference-decoupled
re-parameterization technique to fold the trained MPBN into the firing
threshold. With the re-parameterization technique, the MPBN will not introduce
any extra time burden at inference. Furthermore, the MPBN can also adopt
the element-wise form, while the BNs after the convolution layer can only
use the channel-wise form. Experimental results show that the proposed MPBN
performs well on both popular non-spiking static and neuromorphic datasets. Our
code is open-sourced at https://github.com/yfguo91/MPBN.
Comment: Accepted by ICCV202
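The idea of folding a trained BN layer into the firing threshold can be sketched for a single neuron. This is a minimal illustration, not the authors' implementation: it assumes a standard affine BN with a positive scale `gamma`, a hard-threshold firing function, and omits membrane potential updates and resets. All function names are hypothetical.

```python
import math


def batch_norm(u, mean, var, gamma, beta, eps=1e-5):
    """Standard inference-time BN applied to a membrane potential u."""
    return gamma * (u - mean) / math.sqrt(var + eps) + beta


def fire_with_bn(u, v_th, mean, var, gamma, beta, eps=1e-5):
    """Training-time view: normalize the potential, then compare to v_th."""
    return 1 if batch_norm(u, mean, var, gamma, beta, eps) >= v_th else 0


def folded_threshold(v_th, mean, var, gamma, beta, eps=1e-5):
    """Solve BN(u) >= v_th for u, giving an equivalent threshold on raw u.

    Assumes gamma > 0; a negative gamma would flip the inequality.
    """
    if gamma <= 0:
        raise ValueError("this sketch assumes gamma > 0")
    return mean + (v_th - beta) * math.sqrt(var + eps) / gamma


def fire_folded(u, threshold):
    """Inference-time view: no BN computation, only a threshold comparison."""
    return 1 if u >= threshold else 0


# Example parameters (arbitrary values chosen for illustration).
params = dict(mean=0.3, var=0.04, gamma=1.5, beta=-0.2)
new_th = folded_threshold(0.5, **params)
potentials = [-1.0, 0.0, 0.2, 0.39, 0.4, 1.0]
with_bn = [fire_with_bn(u, 0.5, **params) for u in potentials]
folded = [fire_folded(u, new_th) for u in potentials]
```

Under these assumptions both paths emit the same spikes, so the normalization costs nothing extra once the threshold has been precomputed.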
RMP-Loss: Regularizing Membrane Potential Distribution for Spiking Neural Networks
Spiking Neural Networks (SNNs), as one of the biology-inspired models, have
received much attention recently. They can significantly reduce energy
consumption since they quantize the real-valued membrane potentials to 0/1
spikes to transmit information, so the multiplications of activations and
weights can be replaced by additions when implemented on hardware. However,
this quantization mechanism will inevitably introduce quantization error, thus
causing catastrophic information loss. To address the quantization error
problem, we propose a regularizing membrane potential loss (RMP-Loss) to adjust
the membrane potential distribution, which is directly related to the
quantization error, to a range close to the spikes. Our method is extremely
simple to implement and makes training an SNN straightforward. Furthermore, it
is shown to consistently outperform previous state-of-the-art methods over
different network architectures and datasets.
Comment: Accepted by ICCV202
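One way to picture a loss that pulls membrane potentials toward the spike values is sketched below. The abstract does not give the formula, so the mean squared distance between each potential and the 0/1 spike it produces is an assumed form, and `v_th=0.5` and the function names are hypothetical choices for illustration.

```python
def spike(u: float, v_th: float = 0.5) -> float:
    """Quantize a membrane potential to a 0/1 spike with a hard threshold."""
    return 1.0 if u >= v_th else 0.0


def rmp_style_loss(potentials: list[float], v_th: float = 0.5) -> float:
    """Mean squared quantization error between potentials and their spikes.

    Assumed form: the regularizer is small when potentials already sit
    close to the spike values 0 and 1.
    """
    if not potentials:
        raise ValueError("potentials must be non-empty")
    return sum((u - spike(u, v_th)) ** 2 for u in potentials) / len(potentials)


# Potentials near the threshold incur a larger penalty than those near 0 or 1.
near_threshold = rmp_style_loss([0.4, 0.6])
near_spikes = rmp_style_loss([0.1, 0.9])
```

In training, a term like this would be added to the task loss with a weighting coefficient, which is also an assumption here.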
On the Circular Polarisation of Repeating Fast Radio Bursts
Fast-spinning (e.g., sub-second) neutron stars with ultra-strong magnetic
fields (so-called magnetars) are one of the promising origins of repeating
fast radio bursts (FRBs). Here we discuss circularly polarised emissions
produced by propagation effects in the magnetosphere of fast-spinning
magnetars. We argue that the polarisation-limiting region is well beyond the
light cylinder, suggesting that wave mode coupling effects are unlikely to
produce strong circular polarisation for fast-spinning magnetars. Cyclotron
absorption could be significant if the secondary plasma density is high.
However, high degrees of circular polarisation can only be produced with large
asymmetries in electrons and positrons. We draw attention to the non-detection
of circular polarisation in current observations of known repeating FRBs. We
suggest that the circular polarisation of FRBs could provide key information on
their origins and help distinguish different radiation mechanisms.
Comment: ApJ accepte
Clinical comparison of percutaneous transforaminal endoscopic discectomy and unilateral biportal endoscopic discectomy for single-level lumbar disc herniation
Purpose: To compare the clinical outcomes of percutaneous transforaminal endoscopic discectomy (PTED) and unilateral biportal endoscopic discectomy (UBE) for the treatment of single-level lumbar disc herniation (LDH).
Materials and methods: From January 2020 to November 2021, 62 patients with single-level LDH were retrospectively reviewed. All patients underwent spinal surgeries at the Affiliated Hospital of Chengde Medical University and Beijing Tongren Hospital, Capital Medical University. Among them, 30 patients were treated with UBE, and 32 were treated with PTED. The patients were followed up for at least one year. Patient demographics and perioperative outcomes were reviewed before and after surgery. The Oswestry Disability Index (ODI), visual analog scale (VAS) for back pain and leg pain, and modified MacNab criteria were used to evaluate the clinical outcomes. X-ray examinations were performed one year after surgery to assess the stability of the lumbar spine.
Results: The mean ages in the UBE and PTED groups were 46.7 years and 48.0 years, respectively. Compared to the UBE group, the PTED group had better VAS scores for back pain at 1 and 7 days after surgery (3.06 ± 0.80 vs. 4.03 ± 0.81, P < 0.05; 2.81 ± 0.60 vs. 3.70 ± 0.79, P < 0.05). The UBE and PTED groups demonstrated significant improvements in the VAS score for leg pain and ODI score, and no significant differences were found between the groups at any time after the first month (P > 0.05). Although the good-to-excellent rate of the modified MacNab criteria in the UBE group was similar to that in the PTED group (86.7% vs. 87.5%, P > 0.05), PTED was advantageous in terms of the operation time, estimated blood loss, incision length, and length of postoperative hospital stay.
Conclusions: Both UBE and PTED have favorable outcomes in patients with single-level LDH. However, PTED is superior to UBE in terms of short-term postoperative back pain relief and perioperative quality of life.